
    Digital forensic readiness intelligence crime repository

    It may not always be possible to conduct a digital (forensic) investigation post-event if there is no process in place to preserve potential digital evidence. This study posits the importance of digital forensic readiness, or forensic-by-design, and presents an approach for constructing a Digital Forensic Readiness Intelligence Repository (DFRIR). Building on the concept of knowledge sharing, the authors propose an intelligence repository that can be used to cross-reference potential digital evidence (PDE) sources that may help digital investigators during the process. The approach captures PDE from different sources and creates a DFR repository that can be shared, in the form of intelligence, across diverse jurisdictions among digital forensic experts and law enforcement agencies (LEAs). To validate the approach, the study employs a qualitative evaluation based on a number of metrics and incorporates an analysis of experts' opinions. The DFRIR seeks to maximize the collection of PDE and to reduce the time needed to conduct a forensic investigation (e.g., by reducing the time spent on learning). The study then explains how such an approach can be employed in conjunction with ISO/IEC 27043:2015.
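    The following minimal sketch (in Python) illustrates what a shareable DFRIR entry could look like; the record fields, helper names, and example values are illustrative assumptions, not taken from the paper.

        # Hypothetical sketch of a shareable PDE intelligence record; field names and
        # values are assumptions for illustration, not the paper's actual schema.
        import hashlib
        import json
        from dataclasses import asdict, dataclass
        from datetime import datetime, timezone

        @dataclass
        class PDERecord:
            source: str          # e.g. "firewall-log", "cloud-instance-snapshot"
            jurisdiction: str    # where the evidence was collected
            description: str     # what the potential digital evidence contains
            sha256: str          # integrity digest of the preserved artefact
            collected_at: str    # ISO 8601 timestamp

        def make_record(source: str, jurisdiction: str, description: str, payload: bytes) -> PDERecord:
            """Create a repository entry with an integrity hash for cross-referencing and sharing."""
            return PDERecord(
                source=source,
                jurisdiction=jurisdiction,
                description=description,
                sha256=hashlib.sha256(payload).hexdigest(),
                collected_at=datetime.now(timezone.utc).isoformat(),
            )

        record = make_record("firewall-log", "EU", "Denied outbound connections, 24 h window", b"...raw log bytes...")
        print(json.dumps(asdict(record), indent=2))   # intelligence entry ready to be shared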

    Do independent boards pay attention to corporate sustainability? Gender diversity can make a difference

    Purpose: Using the attention-based view, this paper aims to examine whether and how board composition might influence the allocation of board attention to corporate sustainability. Design/methodology/approach: This is a conceptual paper that uses a theoretical perspective pointing to the importance of generating a board composition that might benefit both business case framing and paradoxical framing, a typology introduced in the managerial cognition literature to explain managerial decision-making. Findings: The conclusions emerging from the reviewed literature suggest that boards that have realized independence of perspective focus on shareholder profit maximization at the expense of considerations of corporate sustainability. It emerges that women directors who have adopted paradoxical framing can enable boards to consider not only economic but also environmental and social issues of sustainability during board decision-making. Further, it is noted that the effect of gender diversity on the allocation of board attention to corporate sustainability is contingent upon contextual (board openness) and structural (chairperson leadership) factors that facilitate social interactions inside boardrooms. Originality/value: By considering alternative cognitive frames as well as social interactions, the propositions contribute to a better understanding of the allocation of board attention regarding ambiguous sustainability issues.

    FedCSD: A Federated Learning Based Approach for Code-Smell Detection

    This paper proposes a Federated Learning Code Smell Detection (FedCSD) approach that allows organizations to collaboratively train federated ML models while preserving their data privacy. These assertions are supported by three experiments that leverage three manually validated datasets aimed at detecting and examining different code smell scenarios. In experiment 1, a centralized training experiment, dataset two achieved the lowest accuracy (92.30%) with fewer smells, while datasets one and three achieved the highest accuracy with a slight difference (98.90% and 99.5%, respectively). This was followed by experiment 2, a cross-evaluation in which each ML model was trained on one dataset and then evaluated on the other two datasets. Results from this experiment show a significant drop in the model's accuracy (lowest accuracy: 63.80%) where fewer smells exist in the training dataset, which has a noticeable reflection (technical debt) on the model's performance. Finally, the third experiment evaluates the approach by splitting the dataset across 10 companies. The ML model was trained on each company's site, and all model-updated weights were then transferred to the server. Ultimately, an accuracy of 98.34% was achieved by the global model trained across the 10 companies for 100 training rounds. The results reveal a slight difference between the global model's accuracy and the highest accuracy of the centralized model, which can be ignored in favour of the global model's comprehensive knowledge, lower training cost, preservation of data privacy, and avoidance of the technical debt problem.
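    As a rough illustration of the federated workflow described in experiment 3, the Python sketch below averages locally trained logistic-regression weights across 10 simulated companies; it is not the authors' FedCSD implementation, and the synthetic data, model, and hyperparameters are assumptions.

        # Minimal federated-averaging sketch (not the FedCSD code): each company trains
        # locally, only the weights leave the site, and the server averages them.
        import numpy as np

        rng = np.random.default_rng(0)
        n_companies, n_features = 10, 20

        # Simulated per-company code-metric datasets (label 1 = code smell present).
        true_w = rng.normal(size=n_features)
        local_data = []
        for _ in range(n_companies):
            X = rng.normal(size=(200, n_features))
            y = (X @ true_w + rng.normal(scale=0.5, size=200) > 0).astype(float)
            local_data.append((X, y))

        def local_update(w, X, y, lr=0.1, epochs=5):
            """Run a few local gradient-descent epochs starting from the global weights."""
            w = w.copy()
            for _ in range(epochs):
                p = 1.0 / (1.0 + np.exp(-(X @ w)))        # sigmoid predictions
                w -= lr * X.T @ (p - y) / len(y)          # logistic-loss gradient step
            return w

        w_global = np.zeros(n_features)
        for _ in range(100):                              # 100 federated rounds, as in experiment 3
            local_weights = [local_update(w_global, X, y) for X, y in local_data]
            w_global = np.mean(local_weights, axis=0)     # server-side weight averaging

        X_all = np.vstack([X for X, _ in local_data])
        y_all = np.concatenate([y for _, y in local_data])
        print("global model accuracy:", np.mean(((X_all @ w_global) > 0) == y_all))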

    An extensive experimental survey of regression methods

    Regression is a very relevant problem in machine learning, with many different available approaches. The current work presents a comparison of a large collection of 77 popular regression models belonging to 19 families: linear and generalized linear models, generalized additive models, least squares, projection methods, LASSO and ridge regression, Bayesian models, Gaussian processes, quantile regression, nearest neighbors, regression trees and rules, random forests, bagging and boosting, neural networks, deep learning and support vector regression. These methods are evaluated using all the regression datasets of the UCI machine learning repository (83 datasets), with some exceptions due to technical reasons. The experimental work identifies several outstanding regression models: the M5 rule-based model with corrections based on nearest neighbors (cubist), the gradient boosted machine (gbm), the boosting ensemble of regression trees (bstTree) and the M5 regression tree. Cubist achieves the best squared correlation (R2) in 15.7% of datasets and is very near to it (difference below 0.2) for 89.1% of datasets, and the median of these differences over the dataset collection is very low (0.0192), compared e.g. to classical linear regression (0.150). However, cubist is slow and fails in several large datasets, while other similar regression models such as M5 never fail, and their difference to the best R2 is below 0.2 for 92.8% of datasets. Other well-performing regression models are the committee of neural networks (avNNet), extremely randomized regression trees (extraTrees, which achieves the best R2 in 33.7% of datasets), random forest (rf) and ε-support vector regression (svr), but they are slower and fail in several datasets. The fastest regression model is least angle regression (lars), which is 70 and 2,115 times faster than M5 and cubist, respectively. The model requiring the least memory is non-negative least squares (nnls), about 2 GB, similar to cubist, while M5 requires about 8 GB. For 97.6% of datasets there is a regression model among the 10 best that is very near (difference below 0.1) to the best R2, which increases to 100% when allowing differences of 0.2. Therefore, provided that our dataset and model collection are representative enough, the main conclusion of this study is that, for a new regression problem, some model in our top 10 should achieve an R2 near the best attainable for that problem. This work has received financial support from the Erasmus Mundus Euphrates programme [project number 2013-2540/001-001-EMA2], from the Xunta de Galicia (Centro singular de investigación de Galicia, accreditation 2016–2019) and the European Union (European Regional Development Fund, ERDF), Project MTM2016-76969-P (Spanish State Research Agency, AEI) co-funded by the ERDF, and the IAP network from the Belgian Science Policy.
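    For readers who want to reproduce the evaluation protocol in spirit, the sketch below compares a few regressors by R2 and their difference to the best model on one dataset; it uses scikit-learn stand-ins rather than the R/caret implementations (cubist, gbm, M5, ...) actually benchmarked in the study, and the chosen dataset and models are assumptions.

        # Illustrative sketch of the per-dataset comparison: train several regressors,
        # score them with cross-validated R2, and report the difference to the best.
        import numpy as np
        from sklearn.datasets import load_diabetes
        from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
        from sklearn.linear_model import Lars, LinearRegression
        from sklearn.model_selection import cross_val_score
        from sklearn.svm import SVR

        X, y = load_diabetes(return_X_y=True)

        models = {
            "linear": LinearRegression(),
            "lars": Lars(),
            "rf": RandomForestRegressor(n_estimators=100, random_state=0),
            "gbm": GradientBoostingRegressor(random_state=0),
            "svr": SVR(),
        }

        scores = {name: cross_val_score(m, X, y, cv=5, scoring="r2").mean() for name, m in models.items()}
        best = max(scores.values())
        for name, r2 in sorted(scores.items(), key=lambda kv: -kv[1]):
            print(f"{name:7s} R2 = {r2:.3f}   difference to best = {best - r2:.3f}")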

    TrustE-VC: Trustworthy Evaluation Framework for Industrial Connected Vehicles in the Cloud

    The integration between cloud computing and vehicular ad hoc networks, namely vehicular clouds (VCs), has become a significant research area. This integration was proposed to accelerate the adoption of intelligent transportation systems. VCs are expected to carry more computing capabilities that manage large-scale collected data. This trend requires a security evaluation framework that ensures data privacy protection, integrity of information, and availability of resources. To the best of our knowledge, this is the first study that proposes a robust trustworthiness evaluation of the vehicular cloud for security criteria evaluation and selection. This article proposes three-level security features in order to develop effectiveness and trustworthiness in VCs. To assess and evaluate these security features, our evaluation framework consists of three main interconnected components: 1) an aggregation of the security evaluation values of the security criteria for each level; 2) a fuzzy multicriteria decision-making algorithm; and 3) a simple additive weighting associated with the importance-performance analysis and performance rate to visualize the framework findings. The evaluation results of the security criteria, based on the average performance rate and global weight, suggest that data residency, data privacy, and data ownership are the most pressing challenges in assessing data protection in a VC environment. Overall, this article paves the way for a secure VC using an evaluation of effective security features and underscores directions and challenges facing the VC community. This article sheds light on the importance of security by design, emphasizing multiple layers of security when implementing industrial VCs. This work was supported in part by the Ministry of Education, Culture, and Sport, Government of Spain under Grant TIN2016-76373-P, in part by the Xunta de Galicia Accreditation 2016–2019 under Grant ED431G/08 and Grant ED431C 2018/2019, and in part by the European Union under the European Regional Development Fund.
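    The sketch below shows how a simple-additive-weighting aggregation of weighted security criteria might look; the criteria, weights, and ratings are invented for illustration and are not the paper's evaluation data.

        # Minimal simple additive weighting (SAW) sketch over security criteria;
        # all numbers below are illustrative assumptions, not the TrustE-VC results.
        import numpy as np

        criteria = ["data residency", "data privacy", "data ownership", "integrity", "availability"]
        weights = np.array([0.25, 0.25, 0.20, 0.15, 0.15])        # global weights (sum to 1)

        # Performance ratings (0-10) of three candidate VC configurations per criterion.
        ratings = np.array([
            [6, 7, 5, 8, 9],    # configuration A
            [8, 6, 7, 7, 8],    # configuration B
            [5, 9, 6, 9, 7],    # configuration C
        ], dtype=float)

        normalized = ratings / ratings.max(axis=0)                # benefit-type normalization
        saw_scores = normalized @ weights                         # weighted sum per configuration
        for name, score in zip("ABC", saw_scores):
            print(f"configuration {name}: SAW score = {score:.3f}")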

    Quantifying the need for supervised machine learning in conducting live forensic analysis of emergent configurations (ECO) in IoT environments

    Machine learning has been shown to be a promising approach to mine larger datasets, such as those that comprise data from a broad range of Internet of Things (IoT) devices, across complex environment(s) to solve different problems. This paper surveys existing literature on the potential of using supervised classical machine learning techniques, such as K-Nearest Neighbour, Support Vector Machines, Naive Bayes and Random Forest algorithms, in performing live digital forensics for different IoT configurations. There are also a number of challenges associated with the use of machine learning techniques, as discussed in this paper.
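    A brief sketch of the kind of comparison the surveyed works perform is given below; since the paper reviews the literature rather than publishing a dataset, the feature vectors here are synthetic stand-ins for IoT-derived data, and the chosen parameters are assumptions.

        # Hedged sketch: the four classical classifiers named above applied to a
        # synthetic stand-in for IoT event features (label 1 = suspicious activity).
        from sklearn.datasets import make_classification
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.metrics import accuracy_score
        from sklearn.model_selection import train_test_split
        from sklearn.naive_bayes import GaussianNB
        from sklearn.neighbors import KNeighborsClassifier
        from sklearn.svm import SVC

        X, y = make_classification(n_samples=1000, n_features=12, n_informative=6, random_state=0)
        X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

        classifiers = {
            "KNN": KNeighborsClassifier(n_neighbors=5),
            "SVM": SVC(kernel="rbf"),
            "Naive Bayes": GaussianNB(),
            "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
        }
        for name, clf in classifiers.items():
            clf.fit(X_tr, y_tr)
            print(f"{name:13s} accuracy = {accuracy_score(y_te, clf.predict(X_te)):.3f}")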

    Digital forensic readiness in operational cloud leveraging ISO/IEC 27043 guidelines on security monitoring

    An increase in the use of cloud computing technologies by organizations has led to cybercriminals targeting cloud environments to orchestrate malicious attacks. Conversely, this has led to the need for proactive approaches through the use of digital forensic readiness (DFR). Existing studies have attempted to develop proactive prototypes using diverse agent-based solutions that are capable of extracting forensically sound potential digital evidence (PDE); however, these prototypes have generally not been evaluated in operational environments. To address this limitation and further evaluate the degree of PDE relevance in an operational platform, this study developed a prototype in an operational cloud environment to achieve DFR in the cloud. The prototype is deployed and executed in cloud instances hosted on OpenStack, the operational cloud environment. The experiments performed in this study show that it is viable to attain DFR in an operational cloud platform. Further observations show that the prototype is capable of harvesting digital data from cloud instances and storing the data in a forensically sound database. The prototype also prepares the operational cloud environment to be forensically ready for digital forensic investigations without altering the functionality of the OpenStack cloud architecture, by leveraging the ISO/IEC 27043 guidelines on security monitoring.
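    As a loose illustration of the preservation step such a prototype performs, the sketch below stores a harvested record together with an integrity digest so it remains verifiable later; it does not use the OpenStack APIs or the authors' agent, and the table layout and example record are assumptions.

        # Generic, hypothetical sketch of forensically sound storage of harvested data:
        # each record is kept with a SHA-256 digest so tampering can be detected.
        import hashlib
        import sqlite3
        from datetime import datetime, timezone

        conn = sqlite3.connect("dfr_store.db")
        conn.execute("""CREATE TABLE IF NOT EXISTS pde (
            instance_id TEXT, collected_at TEXT, record TEXT, sha256 TEXT)""")

        def preserve(instance_id: str, record: str) -> None:
            """Store a harvested record together with its integrity digest."""
            digest = hashlib.sha256(record.encode()).hexdigest()
            conn.execute("INSERT INTO pde VALUES (?, ?, ?, ?)",
                         (instance_id, datetime.now(timezone.utc).isoformat(), record, digest))
            conn.commit()

        preserve("instance-42", "sshd: Failed password for root from 203.0.113.9")

        # Later verification: recompute each digest and compare with the stored one.
        for instance_id, ts, record, digest in conn.execute("SELECT * FROM pde"):
            assert hashlib.sha256(record.encode()).hexdigest() == digest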

    Machine learning algorithms for pattern visualization in classification tasks and for automatic indoor temperature prediction

    This thesis explores aspects of the field of machine learning, specifically pattern classification and regression (function approximation). Although there are many classification methods for multi-dimensional patterns, in general they all behave like "black boxes" whose operation is difficult or impossible to explain. This thesis develops methods for reducing the dimensionality of data in order to project multi-dimensional classification problems onto a two-dimensional space (a plane). Classifiers can thus be used to learn the projected data and to create two-dimensional maps of classification problems whose graphic nature makes them intuitive and easy to understand, helping to explain the classification problem. After a review of existing techniques for dimensionality reduction, several methods are proposed to project the multidimensional data onto the plane while minimizing the overlap between classes. These methods allow new patterns, not used during the projection learning process, to be projected. Eight types of linear, quadratic and polynomial projections are proposed and combined with four measures of overlap between classes. These projections are compared with another 34 dimensionality reduction methods from the literature on a wide collection of 71 benchmark classification problems. The best results have been obtained by Polynomial Kernel Discriminant Analysis of degree 2 (PKDA2), which creates visual and self-explanatory maps of the classification problems on which a reference classifier (the support vector machine, or SVM) fails only slightly less than on the original multi-dimensional data. A web interface and a local standalone application are also provided, developed using the PHP and Matlab programming languages, respectively, which allow these projections to be applied in order to visualize the 2D maps of any classification problem. In the scope of regression, a wide collection of regressors has been applied to the automatic prediction of temperatures in air conditioning (HVAC) systems. These systems have a direct impact on both the energy consumption and the comfort of buildings, so accurate and reliable modelling of the temperature behaviour constitutes the starting point for the development of energy efficiency plans. The use of regressors to predict the evolution of the indoor temperature of buildings based on internal and external (climatic) conditions makes it possible to evaluate the impact of modifications to the HVAC systems from a comfort perspective. With the aim of developing an efficient model for HVAC systems, this thesis has evaluated 40 regressors belonging to 20 different regressor families, using real data generated by an intelligent building, namely the Centro Singular de Investigación en Tecnoloxías da Información (CiTIUS) of the University of Santiago de Compostela (USC). In addition, different models based on neural networks, which allow automatic re-training and on-line learning of new data, have been developed and compared to the previous 20 off-line regressors. The ability to learn on-line provides robustness to the neural models and allows them to: 1) face circumstances never seen in training due to exceptional climatic situations; and 2) support alterations in the components of the systems produced by errors or changes in the sensor systems.
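    To give a feel for the 2D classification maps described above, the sketch below projects a benchmark problem onto a plane and compares SVM accuracy on the original data and on the projection; since PKDA2 is not available in standard libraries, a degree-2 polynomial Kernel PCA is used here as a rough stand-in, and the dataset choice is an assumption.

        # Hedged sketch of a 2-D classification map; Kernel PCA (poly, degree 2) is a
        # stand-in for the thesis's PKDA2 projection, which is not in scikit-learn.
        import matplotlib
        matplotlib.use("Agg")                      # render to file, no display needed
        import matplotlib.pyplot as plt
        from sklearn.datasets import load_wine
        from sklearn.decomposition import KernelPCA
        from sklearn.model_selection import cross_val_score
        from sklearn.preprocessing import StandardScaler
        from sklearn.svm import SVC

        X, y = load_wine(return_X_y=True)
        X = StandardScaler().fit_transform(X)

        Z = KernelPCA(n_components=2, kernel="poly", degree=2).fit_transform(X)   # 2-D projection

        # Compare the reference classifier on the original data vs. the 2-D map.
        acc_full = cross_val_score(SVC(), X, y, cv=5).mean()
        acc_map = cross_val_score(SVC(), Z, y, cv=5).mean()
        print(f"SVM accuracy: original {acc_full:.3f}  vs  2-D map {acc_map:.3f}")

        plt.scatter(Z[:, 0], Z[:, 1], c=y, cmap="viridis", s=15)
        plt.title("2-D classification map (polynomial Kernel PCA, degree 2)")
        plt.savefig("map.png")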

    Selection of human evaluators for design smell detection using dragonfly optimization algorithm: An empirical study

    Context: Design smell detection is considered an efficient activity that decreases maintainability expenses and improves software quality. The human context plays an essential role in this domain. Objective: In this paper, we propose a search-based approach to optimize the selection of human evaluators for design smell detection. Method: For this purpose, the Dragonfly Algorithm (DA) is employed to identify the optimal or near-optimal human evaluator profiles. An online survey was designed that asks the evaluators to assess a sample of classes for the presence of the god class design smell. The Fleiss' kappa test has been used to validate the proposed approach. Results: The results show that the dragonfly optimization algorithm can be used effectively to decrease the effort (time, cost) of design smell detection by identifying the number and the optimal or near-optimal profiles of the human experts required for the evaluation process. Conclusions: A search-based approach can be effectively used to improve god class design smell detection and, consequently, to minimize maintenance cost.
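    Since the validation relies on inter-rater agreement among the selected evaluators, the sketch below computes Fleiss' kappa for a small, invented ratings matrix; the dragonfly search itself is not reproduced, and the numbers are illustrative assumptions only.

        # Minimal Fleiss' kappa sketch for agreement among selected evaluators;
        # the ratings matrix is invented for illustration.
        import numpy as np

        def fleiss_kappa(counts: np.ndarray) -> float:
            """counts[i, j] = number of evaluators assigning category j (e.g. god class / not) to item i."""
            N, _ = counts.shape
            n = counts.sum(axis=1)[0]                        # evaluators per item (assumed constant)
            p_j = counts.sum(axis=0) / (N * n)               # overall category proportions
            P_i = (np.sum(counts ** 2, axis=1) - n) / (n * (n - 1))
            P_bar, P_e = P_i.mean(), np.sum(p_j ** 2)
            return (P_bar - P_e) / (1 - P_e)

        # 6 classes rated by 5 selected evaluators; columns: [not god class, god class].
        ratings = np.array([[5, 0], [4, 1], [1, 4], [0, 5], [2, 3], [5, 0]])
        print(f"Fleiss' kappa = {fleiss_kappa(ratings):.3f}")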